Research Publications Affiliated with Luxembourg: A Data-Driven Analysis
1 Introduction
This report presents a comprehensive analysis of Luxembourg’s research publication landscape using data extracted from OpenAlex, an open-access scholarly metadata platform. OpenAlex serves as a freely accessible alternative to proprietary academic databases, providing structured information about publications, authors, institutional affiliations, and research collaborations across all academic disciplines.
The data collection methodology focuses on identifying all scholarly works with Luxembourg institutional affiliations recorded in OpenAlex over the past decade. This approach captures both research led by Luxembourg-based scholars and international collaborative projects where Luxembourg institutions participate as co-authors. The temporal scope provides a current snapshot of the country’s scientific output and evolving research partnerships.
OpenAlex aggregates metadata from multiple sources, including institutional repositories, publisher databases, and citation networks. This multi-source approach broadens coverage, but data quality depends on the accuracy of source reporting and on the platform’s ability to correctly identify and link Luxembourg-affiliated works. These limitations should be kept in mind when interpreting the findings presented throughout this report.
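The report does not show how the records were retrieved, so the following is only a minimal sketch of one way to express the selection described above as an OpenAlex API request. The helper name `buildOpenAlexUrl`, the exact date window, and the optional `group_by` parameter are assumptions made for illustration; the function only builds the URL and performs no network request.

```javascript
// Sketch: build an OpenAlex "works" query URL for Luxembourg-affiliated works.
// Assumptions (not taken from the report): the date window and the optional
// group_by parameter. buildOpenAlexUrl is a hypothetical helper name.
function buildOpenAlexUrl({
  countryCode = "LU",
  fromDate = "2015-01-01",
  toDate = "2025-12-31",
  groupBy = null,
} = {}) {
  // OpenAlex combines several filters with commas (logical AND).
  const filters = [
    `institutions.country_code:${countryCode}`,
    `from_publication_date:${fromDate}`,
    `to_publication_date:${toDate}`,
  ];
  const params = new URLSearchParams({ filter: filters.join(",") });
  if (groupBy) params.set("group_by", groupBy);
  return `https://api.openalex.org/works?${params.toString()}`;
}

// Example: a URL that would ask for counts per work type.
console.log(buildOpenAlexUrl({ groupBy: "type" }));
```

Keeping the URL construction separate from the request itself makes the selection criteria easy to inspect and to adjust if a different study period is wanted.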
2 Data Structure and Document Types
The initial dataset encompasses all document types recorded in OpenAlex for Luxembourg-affiliated scholarly works during the study period. The following table displays the distribution of work types and the presence of Digital Object Identifiers (DOIs), which serve as persistent identifiers linking to original publications:
Table 1: Distribution of types of work and missingness of DOI
Based on the substantial predominance of journal articles in the dataset and their central importance in academic research communication, this analysis restricts its focus to articles exclusively. This selection encompasses both publications with DOI identifiers and those without, ensuring comprehensive coverage of Luxembourg’s peer-reviewed research output.
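As an illustration of the two steps just described, the sketch below tabulates works by type and DOI presence and then keeps only articles. It assumes a hypothetical array of OpenAlex-style work records with `type` and `doi` fields; the function names are illustrative and this is not the report's actual pipeline, which is not shown.

```javascript
// Sketch: count works per (type, doi_missing) pair, then keep only articles.
// `works` is a hypothetical array of OpenAlex-style records; only the
// `type` and `doi` fields are used here.
function tabulateTypeByDoi(works) {
  const counts = new Map();
  for (const w of works) {
    const doiMissing = w.doi == null || w.doi === "";
    const key = `${w.type}|${doiMissing}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  // One row per observed combination, in order of first appearance.
  return [...counts].map(([key, n]) => {
    const [type, doiMissing] = key.split("|");
    return { type, doi_missing: doiMissing === "true", n };
  });
}

// Articles are kept whether or not they carry a DOI.
function keepArticles(works) {
  return works.filter((w) => w.type === "article");
}
```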
The analysis distinguishes articles by first-authorship status: either the first author is affiliated with a Luxembourg institution, or the first author is affiliated elsewhere and Luxembourg institutions appear only among the co-authors. This classification separates research leadership from research participation within the national research ecosystem:
Table 2: Number of articles where the first author is LU-affiliated
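The first-author classification behind Table 2 can be sketched as follows. The example assumes OpenAlex-style records in which each entry of `authorships` carries an `author_position` and a list of `institutions` with a `country_code`; the function names and the treatment of works without a first-author entry (counted as not LU-led) are assumptions for illustration, not the report's documented method.

```javascript
// Sketch: flag works whose first author has a Luxembourg affiliation and
// count them per publication year. Function names are hypothetical.
function isLuFirstAuthor(work) {
  const first = (work.authorships ?? []).find(
    (a) => a.author_position === "first"
  );
  // Assumption: a work with no recorded first author counts as not LU-led.
  if (!first) return false;
  return (first.institutions ?? []).some((inst) => inst.country_code === "LU");
}

function countByYearAndLead(works) {
  const counts = new Map();
  for (const w of works) {
    const key = `${w.publication_year}|${isLuFirstAuthor(w)}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  // Column names mirror those used in the report's tables.
  return [...counts].map(([key, total]) => {
    const [year, flag] = key.split("|");
    return {
      publication_year: Number(year),
      is_lu_first_author: flag === "true",
      total,
    };
  });
}
```

A first author with several affiliations is counted as LU-affiliated here as soon as one of them is in Luxembourg; a stricter rule would change the split between the two columns.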
3 Research Domain Analysis
The following visualization examines the temporal distribution of Luxembourg-affiliated research across major scientific domains. The data utilizes OpenAlex’s domain classification system, which categorizes research fields into broad disciplinary areas. The analysis tracks publication volumes over time while maintaining the distinction between Luxembourg-led research (first author affiliation) and collaborative research (non-first author participation):
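To make the domain breakdown concrete, the sketch below shows one way to read a work's domain and to reshape yearly counts into one series per domain, which is the shape a line chart needs. It assumes the OpenAlex `primary_topic.domain.display_name` field and reuses the `MISSING-DOMAIN` label that appears in the report's colour mapping; the function names are hypothetical.

```javascript
// Sketch: derive a domain label per work and build per-domain time series.
// The fallback label matches the "MISSING-DOMAIN" category used in the report.
function primaryDomainName(work) {
  return work.primary_topic?.domain?.display_name ?? "MISSING-DOMAIN";
}

// rows: [{ publication_year, primary_domain_name, total }, ...]
// Returns a Map from domain name to points sorted by year.
function toSeries(rows) {
  const series = new Map();
  for (const r of rows) {
    if (!series.has(r.primary_domain_name)) {
      series.set(r.primary_domain_name, []);
    }
    series.get(r.primary_domain_name).push({
      publication_year: r.publication_year,
      total: r.total,
    });
  }
  for (const points of series.values()) {
    points.sort((a, b) => a.publication_year - b.publication_year);
  }
  return series;
}
```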
4 International Collaboration Patterns
The analysis extends to examining Luxembourg’s research collaboration patterns with international partners. The dataset contains information about co-author affiliations, enabling identification of the most frequent collaborating countries and regions. This section presents the geographical distribution of research partnerships, organized by publication year and distinguished by Luxembourg’s role as lead author versus collaborative partner.
The data processing groups countries into meaningful categories, including major individual nations, regional blocs, and an aggregated “Others” category for countries with lower collaboration frequencies. This approach provides clarity while maintaining analytical depth regarding Luxembourg’s primary research partnerships:
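The grouping described above can be sketched as a simple lookup. The category labels are those that appear in the report's colour mapping; the rule that any other European Union member state falls into "European Union" and everything else into "Others" is an assumption, as is counting each group at most once per publication. Function and constant names are hypothetical.

```javascript
// Sketch: map ISO 3166-1 alpha-2 country codes to the report's categories.
// Countries shown individually (labels taken from the report's colour mapping).
const NAMED_COUNTRIES = new Map([
  ["LU", "Luxembourg"], ["FR", "France"], ["US", "USA"], ["BE", "Belgium"],
  ["CN", "China"], ["GB", "Great Britain"], ["IT", "Italy"], ["ES", "Spain"],
  ["CH", "Switzerland"], ["NL", "Netherlands"],
]);

// The 27 EU member states.
const EU_MEMBERS = new Set([
  "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR",
  "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK",
  "SI", "ES", "SE",
]);

function countryGroup(code) {
  if (NAMED_COUNTRIES.has(code)) return NAMED_COUNTRIES.get(code);
  // Assumption: remaining EU members are pooled into one regional bloc.
  if (EU_MEMBERS.has(code)) return "European Union";
  return "Others";
}

// Assumption: each group is counted once per publication, however many
// co-authors come from it. Order follows first appearance.
function groupsForWork(countryCodes) {
  return [...new Set(countryCodes.map(countryGroup))];
}
```

Under these assumptions a paper with co-authors in Germany, Austria, and Luxembourg contributes one count to "European Union" and one to "Luxembourg".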
Source Code
---
title: "Research Publications Affiliated with Luxembourg: A Data-Driven Analysis"
---

```{r}
#| include: false
library(DT)
library(crosstalk)
library(dplyr)
library(ggplot2)
library(rixpress)
library(tidyr)

rxp_load("type_doi_missing")
rxp_load("lu_first_authors")
rxp_load("primary_domain_lu")
rxp_load("country_authors_unique")
```

## Introduction

This report presents a comprehensive analysis of Luxembourg's research
publication landscape using data extracted from OpenAlex, an open-access
scholarly metadata platform. OpenAlex serves as a freely accessible alternative
to proprietary academic databases, providing structured information about
publications, authors, institutional affiliations, and research collaborations
across all academic disciplines.

The data collection methodology focuses on identifying all scholarly works with
Luxembourg institutional affiliations recorded in OpenAlex over the past decade.
This approach captures both research led by Luxembourg-based scholars and
international collaborative projects where Luxembourg institutions participate
as co-authors. The temporal scope provides a current snapshot of the country's
scientific output and evolving research partnerships.

OpenAlex aggregates metadata from multiple sources including institutional
repositories, publisher databases, and citation networks. While this
multi-source approach enhances coverage comprehensiveness, data quality depends
on the accuracy of source reporting and the platform's ability to correctly
identify and link Luxembourg-affiliated works. These methodological
considerations should be kept in mind when interpreting the analytical findings
presented throughout this report.

## Data Structure and Document Types

The initial dataset encompasses all document types recorded in OpenAlex for
Luxembourg-affiliated scholarly works during the study period.
The following table displays the distribution of work types and the presence of
Digital Object Identifiers (DOIs), which serve as persistent identifiers linking
to original publications:

```{r}
#| echo: false
#| label: tbl-type-doi-missing
#| tbl-cap: Distribution of types of work and missingness of DOI
datatable(
  type_doi_missing,
  caption = "Distribution of types of work and missingness of DOI",
  options = list(
    pageLength = 15,
    scrollX = TRUE,
    dom = 'Bfrtip',
    buttons = c('excel')
  ),
  extensions = 'Buttons'
)
```

Based on the substantial predominance of journal articles in the dataset and
their central importance in academic research communication, this analysis
restricts its focus to articles exclusively. This selection encompasses both
publications with DOI identifiers and those without, ensuring comprehensive
coverage of Luxembourg's peer-reviewed research output.

The analytical framework employs a critical distinction based on first authorship status: whether the primary author maintains affiliation with a Luxembourg institution or represents an international collaboration where Luxembourg institutions participate as secondary contributors.
This classification enables differentiation between research leadership and research participation within the national research ecosystem:

```{r}
#| echo: false
#| label: tbl-lu-first-authors
#| tbl-cap: Number of articles where the first author is LU-affiliated
# Reshape to one row per year with one column per first-author status
lu_first_authors <- lu_first_authors %>%
  pivot_wider(names_from = is_lu_first_author, values_from = total) %>%
  rename(
    `Publication Year` = publication_year,
    `Non-LU first author` = `FALSE`,
    `LU first author` = `TRUE`
  )

# Shared data object so the slider and the table stay in sync
lu_first_authors_shared <- SharedData$new(lu_first_authors)

# Year filter widget
filter_slider(
  id = "publication_year_filter",
  label = "Select Year Range:",
  sharedData = lu_first_authors_shared,
  column = "Publication Year",
  step = 1,
  round = TRUE,
  width = "100%"
)

datatable(
  lu_first_authors_shared,
  filter = "top",
  caption = "Number of articles where the first author is LU-affiliated",
  options = list(
    pageLength = 10,
    scrollX = TRUE,
    dom = 'Bfrtip',
    buttons = c('excel'),
    order = list(list(0, 'desc'))
  ),
  extensions = 'Buttons'
) %>%
  # CSS properties are given in camelCase (fontSize maps to font-size)
  formatStyle(columns = colnames(.), fontSize = '12px')
```

## Research Domain Analysis

The following visualization examines the temporal distribution of
Luxembourg-affiliated research across major scientific domains. The data
utilizes OpenAlex's domain classification system, which categorizes research
fields into broad disciplinary areas.
The analysis tracks publication volumes over time while maintaining the
distinction between Luxembourg-led research (first author affiliation) and
collaborative research (non-first author participation):

```{r}
#| echo: false
#| fig-height: 6
#| fig-width: 12
# One distinctive color per OpenAlex domain
domain_colors <- c(
  "Health Sciences" = "#FF6B35",   # Orange-red
  "Life Sciences" = "#003399",     # Deep blue
  "MISSING-DOMAIN" = "#228B22",    # Forest green
  "Physical Sciences" = "#FF1493", # Deep pink
  "Social Sciences" = "#800080"    # Purple
)

primary_domain_lu %>%
  mutate(
    is_lu_first_author = ifelse(
      is_lu_first_author,
      "LU-affiliated first author",
      "Non LU-affiliated first author"
    )
  ) %>%
  ggplot(aes(
    x = publication_year,
    y = total,
    color = primary_domain_name,
    group = primary_domain_name
  )) +
  geom_line(linewidth = 1.2, alpha = 0.8) +
  geom_point(size = 2, alpha = 0.9) +
  scale_color_manual(values = domain_colors) +
  facet_wrap(
    ~is_lu_first_author,
    labeller = labeller(
      is_lu_first_author = c(
        "LU-affiliated first author" = "LU-affiliated first author",
        "Non LU-affiliated first author" = "Non LU-affiliated first author"
      )
    )
  ) +
  labs(
    title = "Luxembourg Research Publications by Domain Over Time",
    subtitle = "Faceted by Luxembourg First Author Status",
    x = "Publication Year",
    y = "Number of Publications",
    color = "Research Domain"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(size = 10),
    plot.title = element_text(size = 14, face = "bold"),
    plot.subtitle = element_text(size = 12),
    legend.position = "bottom",
    legend.title = element_text(face = "bold"),
    strip.text = element_text(face = "bold", size = 11),
    strip.background = element_rect(fill = "lightgray", color = "black"),
    axis.title = element_text(size = 12),
    panel.grid.major = element_line(color = "lightgray", linewidth = 0.3),
    panel.grid.minor = element_blank()
  ) +
  # Adjust legend to show in multiple rows if needed
  guides(color = guide_legend(nrow = 2, byrow = TRUE))
```

Observable js version of the graph:

```{r}
ojs_define(primary_domain_lu_ojs = primary_domain_lu)
```

```{ojs}
//| echo: false
// Import required libraries
import {Plot} from "@observablehq/plot"
import {Inputs} from "@observablehq/inputs"
import {rangeInput} from "@mootari/range-slider"

// Convert R data to JavaScript format
raw_data = transpose(primary_domain_lu_ojs)

// Convert data types to ensure Observable Plot can use them
data = raw_data.map(d => ({
  publication_year: +d.publication_year, // Convert to number
  primary_domain_name: d.primary_domain_name,
  total: +d.total, // Convert to number
  is_lu_first_author: d.is_lu_first_author === "TRUE" || d.is_lu_first_author === true // Convert to boolean
}))

// Create color mapping
domain_colors = new Map([
  ["Health Sciences", "#FF6B35"],
  ["Life Sciences", "#003399"],
  ["MISSING-DOMAIN", "#228B22"],
  ["Physical Sciences", "#FF1493"],
  ["Social Sciences", "#800080"]
])

// Get unique values for controls
unique_years = [...new Set(data.map(d => d.publication_year))].sort()
unique_domains = [...new Set(data.map(d => d.primary_domain_name))].sort()

viewof year_range = rangeInput({
  min: 2015,
  max: 2025,
  value: [2015, 2025],
  title: 'Select year interval',
  step: 1
})

viewof selected_domains = Inputs.checkbox(
  unique_domains,
  {
    value: unique_domains,
    label: "Research Domains"
  }
)

viewof selected_author_type = Inputs.radio(
  ["LU-affiliated first author", "Non LU-affiliated first author", "Both"],
  {
    value: "Both",
    label: "First Author Affiliation"
  }
)

// Filter data based on selections
filtered_data = data.filter(d => {
  // Year filter
  const yearOk = d.publication_year >= year_range[0] && d.publication_year <= year_range[1];

  // Domain filter
  const domainOk = selected_domains.includes(d.primary_domain_name);

  // Author filter
  let authorOk = false;
  if (selected_author_type === "Both") {
    authorOk = true;
  } else if (selected_author_type === "LU-affiliated first author") {
    authorOk = d.is_lu_first_author === true;
  } else if (selected_author_type === "Non LU-affiliated first author") {
    authorOk = d.is_lu_first_author === false;
  }

  return yearOk && domainOk && authorOk;
})

// Create the plot with better visual distinction
Plot.plot({
  width: 800,
  height: 500,
  marginLeft: 60,
  marginBottom: 60,
  x: {
    label: "Publication Year",
    domain: [Math.min(...unique_years), Math.max(...unique_years)],
    tickFormat: "d"
  },
  y: {
    label: "Number of Publications",
    grid: true
  },
  color: {
    domain: unique_domains,
    range: unique_domains.map(d => domain_colors.get(d)),
    legend: true
  },
  marks: [
    // Solid lines for LU-affiliated authors
    Plot.line(filtered_data.filter(d => d.is_lu_first_author === true), {
      x: "publication_year",
      y: "total",
      stroke: "primary_domain_name",
      strokeWidth: 2.5,
      z: d => `${d.primary_domain_name}-LU`
    }),
    // Dashed lines for non-LU-affiliated authors
    Plot.line(filtered_data.filter(d => d.is_lu_first_author === false), {
      x: "publication_year",
      y: "total",
      stroke: "primary_domain_name",
      strokeWidth: 2.5,
      strokeDasharray: "8,8",
      z: d => `${d.primary_domain_name}-NonLU`
    }),
    // Points with different symbols
    Plot.dot(filtered_data, {
      x: "publication_year",
      y: "total",
      fill: "primary_domain_name",
      symbol: d => d.is_lu_first_author ? "circle" : "triangle",
      r: 5,
      stroke: "white",
      strokeWidth: 1
    }),
    // Tooltips
    Plot.tip(filtered_data, Plot.pointer({
      x: "publication_year",
      y: "total",
      fill: "primary_domain_name",
      title: d => `${d.primary_domain_name}\n${d.publication_year}\nPublications: ${d.total}\nLU First Author: ${d.is_lu_first_author ? "Yes" : "No"}`
    }))
  ],
  title: "Luxembourg Research Publications by Domain Over Time"
})

// Author Type Legend
html`<div style="margin-top: 20px; padding: 15px; border: 1px solid #ddd; border-radius: 5px; background-color: #f9f9f9;">
  <h4 style="margin-top: 0; margin-bottom: 10px;">Author Type Legend:</h4>
  <div style="display: flex; gap: 30px; align-items: center;">
    <div style="display: flex; align-items: center; gap: 8px;">
      <svg width="30" height="20">
        <line x1="0" y1="10" x2="30" y2="10" stroke="#333" stroke-width="2.5" />
        <circle cx="15" cy="10" r="4" fill="#333" stroke="white" stroke-width="1"/>
      </svg>
      <span><strong>LU-affiliated first author</strong> (solid line + circle)</span>
    </div>
    <div style="display: flex; align-items: center; gap: 8px;">
      <svg width="30" height="20">
        <line x1="0" y1="10" x2="30" y2="10" stroke="#333" stroke-width="2.5" stroke-dasharray="2,8"/>
        <polygon points="15,6 19,14 11,14" fill="#333" stroke="white" stroke-width="1"/>
      </svg>
      <span><strong>Non-LU-affiliated first author</strong> (dashed line + triangle)</span>
    </div>
  </div>
</div>`
```

## International Collaboration Patterns

The analysis extends to examining Luxembourg's research collaboration patterns
with international partners. The dataset contains information about co-author
affiliations, enabling identification of the most frequent collaborating
countries and regions. This section presents the geographical distribution of
research partnerships, organized by publication year and distinguished by
Luxembourg's role as lead author versus collaborative partner.

The data processing groups countries into meaningful categories, including major
individual nations, regional blocs, and an aggregated "Others" category for
countries with lower collaboration frequencies.
This approach provides clarity while maintaining analytical depth regarding
Luxembourg's primary research partnerships:

```{r}
#| echo: false
#| warning: false
#| fig-height: 14
#| fig-width: 12
country_colors <- c(
  "European Union" = "#003399", # Deep blue (EU flag)
  "Others" = "#FF1493",         # Deep pink
  "Luxembourg" = "#FF6B35",     # Orange-red
  "France" = "#800080",         # Purple
  "USA" = "#228B22",            # Forest green
  "Belgium" = "#FFD700",        # Gold
  "China" = "#DE2910",          # Red
  "Great Britain" = "#C8102E",  # British red (Union Jack)
  "Italy" = "#009246",          # Italian green (flag)
  "Spain" = "#AA151B",          # Spanish red (flag)
  "Switzerland" = "#FF0000",    # Swiss red (flag)
  "Netherlands" = "#FF7F00"     # Dutch orange (national color)
)

# Plot for publication with co-authors
ggplot(
  country_authors_unique,
  aes(x = publication_year, y = n, fill = country_groups)
) +
  # Dodge bars for multiple countries per year; bar outlines use `linewidth`
  # (the `size` argument is deprecated for lines since ggplot2 3.4.0)
  geom_col(position = "dodge", color = "black", linewidth = 0.5) +
  scale_fill_manual(values = country_colors) +
  facet_wrap(
    ~is_lu_first_author,
    ncol = 1,
    labeller = labeller(
      is_lu_first_author = c(
        "FALSE" = "LU Not First Author",
        "TRUE" = "LU First Author"
      )
    )
  ) +
  labs(
    title = "Publications by Country/Region Over Time",
    subtitle = "Faceted by Luxembourg First Author Status",
    x = "Publication Year",
    y = "Total Publications",
    fill = "Country/Region"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(size = 10),
    plot.title = element_text(size = 14, face = "bold"),
    plot.subtitle = element_text(size = 12),
    legend.position = "bottom",
    legend.title = element_text(face = "bold"),
    strip.text = element_text(face = "bold", size = 11),
    strip.background = element_rect(fill = "lightgray", color = "black")
  ) +
  # Adjust legend to show in multiple rows if needed
  guides(fill = guide_legend(nrow = 2, byrow = TRUE))
```